Porting a sphere optimization program from LAPACK to ScaLAPACK

نویسندگان

  • Paul C. Leopardi
  • Robert S. Womersley
چکیده

The sphere optimization program sphopt was originally written as a sequential program using lapack, and was converted to use scalapack, primarily to overcome memory limitations. The conversion was relatively straightforward, using a small number of organizing principles which are widely applicaple to the scalapack parallelization of serial code. The main innovation is the use of a compressed block cyclic storage scheme to store two symmetric matrices in little more than the storage needed for one matrix. The resulting psphopt program scales at least as well as the scalapack Cholesky decomposition routine which it uses.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SNAP (Small-World Network Analysis and Partitioning) Framework

Discussion Both LAPACK and ScaLAPACK libraries contain routines for solving systems of linear equations, least squares problems, and eigenvalue problems. The goals of both projects are efficiency (to run as fast as possible), scalability (as the problem size and number of processors grow), reliability (including error bounds), portability (across all important parallel machines), flexibility (s...

متن کامل

PyACTS: A High-Level Framework for Fast Development of High Performance Applications

Software reusability has proven to be an effective practice to speed-up the development of complex high-performance scientific and engineering applications. We promote the reuse of high quality software and general purpose libraries through the Advance CompuTational Software (ACTS) Collection. ACTS tools have continued to provide solutions to many of today’s computational problems. In addition,...

متن کامل

LAPACK Working Note # 216 : A novel parallel QR algorithm for hybrid distributed memory HPC systems ∗

A novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing (HPC) systems is presented. For this purpose, we introduce the concept of multi-window bulge chain chasing and parallelize aggressive early deflation. The multi-window approach ensures that most computations when chasing chains of bulges are performed ...

متن کامل

LAPACK Working Note 58 The Design of Linear Algebra Libraries for High Performance Computers

This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under develo...

متن کامل

Implementing Communication-optimal Parallel and Sequential Qr Factorizations

We present parallel and sequential dense QR factorization algorithms for tall and skinny matrices and general rectangular matrices that both minimize communication, and are as stable as Householder QR. The sequential and parallel algorithms for tall and skinny matrices lead to significant speedups in practice over some of the existing algorithms, including LAPACK and ScaLAPACK, for example up t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008